FARANE-Q: Fast Parallel and Pipeline Q-Learning Accelerator for Configurable Reinforcement Learning SoC

Authors

Abstract

This paper proposes a FAst paRAllel and pipeliNE Q-learning accelerator (FARANE-Q) for a configurable Reinforcement Learning (RL) algorithm implemented in a System on Chip (SoC). The proposed work offers flexibility, configurability, and scalability while maintaining computation speed and accuracy, to overcome the challenges of a dynamic environment and increasing complexity. The proposed method includes a Hardware/Software (HW/SW) co-design methodology and an SoC architecture to achieve flexibility. We also propose joint optimizations of the algorithm, architecture, and implementation to obtain optimum (high-efficiency) performance, specifically energy and area efficiency. Furthermore, we demonstrate real-time operation on a Zynq Ultra96-V2 FPGA platform to evaluate the functionality with an actual use case of smart navigation. Experimental results confirm that FARANE-Q outperforms state-of-the-art works by achieving a throughput of up to 148.55 MSps, which corresponds to an efficiency of 1747.64 MSps/W per agent for the 32-bit FARANE-Q and 2424.33 MSps/W per agent for the 16-bit FARANE-Q. Moreover, it improves on other related works by at least $1.23\times$. The designed system maintains an error of less than 0.4% with an optimized bit precision of more than eight fraction bits. The processing time is accelerated by $1795\times$ compared with embedded software executed on an ARM processor and by $280\times$ compared with a full-software implementation on an Intel Core i7 processor. Hence, FARANE-Q has the potential to be used in navigation, robotic control, and predictive maintenance.
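
For context, the core recurrence that tabular Q-learning accelerators parallelize and pipeline is the standard Watkins update; the abstract does not spell out FARANE-Q's exact datapath, so this is background rather than the paper's own formulation:

$$Q(s,a) \leftarrow Q(s,a) + \alpha\left[r + \gamma \max_{a'} Q(s',a') - Q(s,a)\right]$$

Here $\alpha$ is the learning rate and $\gamma$ the discount factor; in a fixed-point design such as the 16-bit variant, $Q$, $r$, $\alpha$, and $\gamma$ are quantized, which is where the reported error below 0.4% for more than eight fraction bits becomes relevant.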

Similar Articles

Q-learning for history-based reinforcement learning

We extend the Q-learning algorithm from the Markov Decision Process setting to problems where observations are non-Markov and do not reveal the full state of the world i.e. to POMDPs. We do this in a natural manner by adding $\ell_0$ regularisation to the pathwise squared Q-learning objective function and then optimise this over both a choice of map from history to states and the resulting MDP parame...
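
As a hedged sketch of the objective this excerpt describes (our notation, not necessarily the paper's): writing $\phi$ for the candidate map from histories $h_t$ to states and $\lambda$ for the regularisation weight, the pathwise squared objective with $\ell_0$ regularisation could read

$$\min_{\phi,\,Q}\ \sum_{t}\Bigl(Q(\phi(h_t),a_t) - r_t - \gamma \max_{a'} Q(\phi(h_{t+1}),a')\Bigr)^2 + \lambda\,\lVert\phi\rVert_0$$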

Deep Reinforcement Learning with Double Q-Learning

The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether this harms performance, and whether they can generally be prevented. In this paper, we answer all these questions affirmatively. In particular, we first show that the recent DQN algorithm, which combines Q-le...
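
The excerpt is truncated before the method itself; for reference, the Double DQN target introduced in this work decouples action selection (online weights $\theta_t$) from action evaluation (target-network weights $\theta_t^{-}$):

$$Y_t = R_{t+1} + \gamma\, Q\bigl(S_{t+1},\ \operatorname*{arg\,max}_{a} Q(S_{t+1}, a; \theta_t);\ \theta_t^{-}\bigr)$$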

Q-Decomposition for Reinforcement Learning Agents

The paper explores a very simple agent design method called Q-decomposition, wherein a complex agent is built from simpler subagents. Each subagent has its own reward function and runs its own reinforcement learning process. It supplies to a central arbitrator the Q-values (according to its own reward function) for each possible action. The arbitrator selects an action maximizing the sum of Q-v...
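
The arbitration rule in the excerpt can be written directly: with $n$ subagents, each maintaining $Q_i$ under its own reward function, the arbitrator selects

$$a^{\star} = \operatorname*{arg\,max}_{a}\ \sum_{i=1}^{n} Q_i(s,a)$$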

Learning mixed behaviours with parallel Q-learning

This paper presents a reinforcement learning algorithm based on a parallel version of Watkins’s Q-learning. This algorithm is used to control a two-axis micro-manipulator system. The aim is to learn complex behaviours such as reaching target positions and avoiding obstacles at the same time. The simulations and the tests with the real manipulator show that this algorithm is able to learn simult...
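
A hedged reading of the parallel setup, assuming one tabular learner per behaviour $b$ (e.g., reaching and avoiding), each updated from its own reward signal $r_b$ over the shared experience $(s,a,s')$:

$$Q_b(s,a) \leftarrow Q_b(s,a) + \alpha\bigl[r_b + \gamma \max_{a'} Q_b(s',a') - Q_b(s,a)\bigr]$$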

Shared Learning: Enhancing Reinforcement in $Q$-Ensembles

Deep Reinforcement Learning has been able to achieve amazing successes in a variety of domains from video games to continuous control by trying to maximize the cumulative reward. However, most of these successes rely on algorithms that require a large amount of data to train in order to obtain results on par with human-level performance. This is not feasible if we are to deploy these systems on...

Journal

Journal title: IEEE Access

Year: 2023

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2022.3232853